PoC or GTFO by MANUL Laphroaig

Author: MANUL Laphroaig
Language: eng
Format: azw3
ISBN: 9781593278816
Publisher: No Starch Press
Published: 2017-10-10T04:00:00+00:00


6:3 Gekko the Dolphin

by Fiora

The Porpoise of Dolphin

Dolphin is one of the most popular emulators, supporting games and other applications for the GameCube and Wii game consoles. Featuring a highly optimized just-in-time (JIT) compiler and a graphics unit that translates GPU opcodes into vertices, textures, and shaders, Dolphin is able to emulate almost all GameCube and Wii games at high speeds on a modern x86 CPU.

Instead of trying to do a detailed anatomy of the entire system, much of which is beyond my current understanding, in this PoC∥GTFO article I’m going to focus on some particularly evil assembly optimizations and interesting bug fixes in the Dolphin JIT from the past two months: some large and dramatic, others small and elegant (or horrifically hacky, depending on your perspective!). But first, let’s do a quick overview of how Dolphin works and some of the biggest difficulties inherent in GameCube/Wii emulation.

Dolphin’s JIT is superficially similar to a typical PowerPC emulator, but things are not nearly so simple as they appear. The GameCube’s Gekko CPU (and the extremely similar Broadway CPU on the Wii) has a number of particularly odd features that aren’t present on a typical PowerPC.

• A “paired singles” SIMD unit, somewhat similar to 3DNow! but complicated by some of PowerPC’s inherent weirdnesses with floating-point. (32-bit floats are represented as 64-bit internally, similar to x87.)

• Built-in “graphics quantization” registers, which allow quantized loads and stores based on runtime-variable parameters, up to and including the data type to be converted to and from.

• A complex memory layout with mirrored regions and a slew of MMIO features, including a memory-mapped FIFO usually connected to the GPU, but which can also be repurposed for other uses by games.

• The ability to directly access—and modify—the active GPU frame buffer.

• Complex cache manipulation features, such as the ability to enable a “locked cache” and access memory as cached or uncached.

• A floating point unit with its own very unique definition of the word “multiply.”
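To make the quantized loads and stores from the list above a bit more concrete, here is a minimal sketch in Python. It is not Dolphin's code and does not model the hardware's register layout. The type tags, the power-of-two scale, the clamping, and the truncating rounding are all assumptions chosen for illustration. What it does show is that both the data type and the scale arrive as runtime values, so an emulator cannot pick the conversion when it translates the instruction.

```python
import struct

# Illustrative type tags; these are NOT the hardware's register encoding.
# ">" selects big-endian, matching PowerPC byte order.
_FORMATS = {"f32": ">f", "u8": ">B", "u16": ">H", "s8": ">b", "s16": ">h"}
_RANGES = {
    "u8": (0, 255),
    "u16": (0, 65535),
    "s8": (-128, 127),
    "s16": (-32768, 32767),
}


def quantized_load(memory, addr, qtype, scale):
    """Read one value at addr and convert it to a float.

    qtype and scale are runtime parameters, so the conversion
    is only known when the load actually executes.
    """
    (raw,) = struct.unpack_from(_FORMATS[qtype], memory, addr)
    if qtype == "f32":
        return raw                      # floats pass through unscaled
    return raw * 2.0 ** -scale          # integer -> float, scaled down


def quantized_store(memory, addr, value, qtype, scale):
    """Convert a float to the requested type and write it at addr."""
    if qtype == "f32":
        struct.pack_into(_FORMATS[qtype], memory, addr, value)
        return
    lo, hi = _RANGES[qtype]
    raw = int(value * 2.0 ** scale)     # assumption: truncate toward zero
    raw = max(lo, min(hi, raw))         # assumption: saturate on overflow
    struct.pack_into(_FORMATS[qtype], memory, addr, raw)


ram = bytearray(8)
quantized_store(ram, 0, 1.5, "s16", 8)          # stored as 384 = 0x0180
round_trip = quantized_load(ram, 0, "s16", 8)
quantized_store(ram, 4, 1000.0, "u8", 0)        # saturates at 255
clamped = quantized_load(ram, 4, "u8", 0)
```

A JIT facing this has to either dispatch on the type at runtime or guess the parameters and check that guess, which is part of why these instructions are awkward to emulate quickly.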

Making emulation even more difficult, games tend to abuse every aspect of the system imaginable, from the precise rounding of every floating point instruction to self-modifying code to behavior that isn’t even defined in IBM’s specification for the CPU. Additionally, games typically run in supervisor mode, giving them the ability to abuse a wide variety of features user-mode applications can’t. All of this leads to severe limits on the shortcuts Dolphin can take; the most benign-seeming optimization often results in a slew of unintended consequences. Dolphin can’t even reorder memory loads; an attempt to do this resulted in a real game failing because of exception handling semantics not being maintained.3
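Why reordering loads can break exception semantics is easier to see with a toy model. The sketch below is an assumption-laden illustration, not the actual Dolphin bug: registers are a dictionary, memory is a dictionary of mapped addresses, and a load from an unmapped address raises a fault. The register names and addresses are made up.

```python
class PageFault(Exception):
    """Raised by a load from an unmapped address (toy model)."""


def load(memory, addr):
    if addr not in memory:
        raise PageFault(addr)
    return memory[addr]


def run(loads, memory):
    """Execute (dest_reg, addr) loads in the given order.

    Returns the register state an exception handler would observe
    at the moment of the fault, or the final state if nothing faults.
    """
    regs = {"r3": 0, "r4": 0}
    try:
        for dest, addr in loads:
            regs[dest] = load(memory, addr)
    except PageFault:
        pass  # the handler would inspect regs exactly as they are now
    return regs


memory = {0x1000: 42}                       # 0x2000 is deliberately unmapped
program = [("r3", 0x1000), ("r4", 0x2000)]  # program order

in_order = run(program, memory)
reordered = run(list(reversed(program)), memory)
```

In program order, the handler sees `r3` already holding 42 when the second load faults. With the loads swapped, the fault arrives first and `r3` is still 0, so a handler that relies on the architectural state sees something the real hardware would never produce.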

Yes, there are applications that require precise emulation of MMU mechanics, including post-exception rollback. Yes, there are applications that intentionally try to execute an address of 0x00000001 to trigger a custom exception handler, and won’t run unless this behavior is properly emulated. Yes, there are applications that modify code without properly flushing the CPU instruction cache and rely on the mere hope that the old code will have been since replaced in the cache.
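The last case, code modified without an instruction cache flush, can be sketched with a toy cache model. This is a simplified illustration under stated assumptions, not a model of the real cache: the class name, the tiny capacity, and the oldest-first eviction policy are all hypothetical.

```python
class ToyCPU:
    """Toy CPU with a tiny instruction cache that data stores do not update."""

    def __init__(self, memory, capacity=2):
        self.memory = memory      # addr -> instruction
        self.icache = {}          # addr -> cached instruction
        self.capacity = capacity

    def fetch(self, addr):
        """Fetch an instruction, filling the cache on a miss."""
        if addr not in self.icache:
            if len(self.icache) >= self.capacity:
                # Evict the oldest entry (dicts keep insertion order).
                del self.icache[next(iter(self.icache))]
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def store(self, addr, value):
        """A data write changes memory but leaves the icache untouched."""
        self.memory[addr] = value

    def icache_invalidate(self, addr):
        """What well-behaved self-modifying code is supposed to call."""
        self.icache.pop(addr, None)


cpu = ToyCPU({0: "old", 1: "a", 2: "b"})
cpu.fetch(0)                    # caches "old"
cpu.store(0, "new")             # modify code, no flush
stale = cpu.fetch(0)            # still served from the cache

cpu.fetch(1)
cpu.fetch(2)                    # unrelated fetches push address 0 out
after_eviction = cpu.fetch(0)   # now re-read from memory
```

An emulator that ignores the instruction cache entirely would return the new code immediately, which is the "correct" behavior a sloppy program may not actually expect. One that wants to run such programs has to decide how much of this staleness to reproduce.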


